Researchers at Palmer Station, Antarctica, set out to collect data on three major Antarctic penguin species: Adelie, chinstrap and gentoo. They collected various quantitative measurements of the penguins, including flipper length, bill depth and bill length, along with other data points such as sex and island.
The researchers would like to obtain some insight from this data. They have tasked you, the reader, with gaining the following insight about the Palmer penguins dataset:
The Palmer penguins dataset has several columns, but our analysis needs only some of them. First, we’ll load the tidyverse libraries we’ll use to explore the data, along with the palmerpenguins package that provides the dataset.
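A minimal sketch of the loading step, assuming the tidyverse and palmerpenguins packages are already installed. The attach messages in this report also mention bslib and plotly, which would be loaded the same way; they are left commented out here because their role in the original script is an assumption.

```r
# Load the tidyverse (dplyr, ggplot2, tidyr, ...) for wrangling and plotting
library(tidyverse)

# Load the package that ships the `penguins` data frame
library(palmerpenguins)

# Assumption: the original script also attached these two packages
# library(bslib)
# library(plotly)

# Quick check that the dataset is available: prints rows and columns
dim(penguins)
```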
## ── Attaching core tidyverse packages ──────────────────────── tidyverse 2.0.0 ──
## ✔ dplyr 1.1.2 ✔ readr 2.1.4
## ✔ forcats 1.0.0 ✔ stringr 1.5.0
## ✔ ggplot2 3.4.2 ✔ tibble 3.2.1
## ✔ lubridate 1.9.2 ✔ tidyr 1.3.0
## ✔ purrr 1.0.1
## ── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
## ✖ dplyr::filter() masks stats::filter()
## ✖ dplyr::lag() masks stats::lag()
## ℹ Use the conflicted package (<http://conflicted.r-lib.org/>) to force all conflicts to become errors
##
## Attaching package: 'bslib'
##
##
## The following object is masked from 'package:utils':
##
## page
## Warning: package 'plotly' was built under R version 4.3.2
##
## Attaching package: 'plotly'
##
## The following object is masked from 'package:ggplot2':
##
## last_plot
##
## The following object is masked from 'package:stats':
##
## filter
##
## The following object is masked from 'package:graphics':
##
## layout
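The tibbles, row count and column names printed next are consistent with a first look along the following lines. This is a sketch rather than the original script: the name `penguins_clean` is hypothetical, and dropping incomplete rows with `drop_na()` is an assumption inferred from the 333 rows reported.

```r
library(dplyr)
library(tidyr)
library(palmerpenguins)

# Assumption: rows with any missing value are dropped before inspection
penguins_clean <- penguins %>% drop_na()

head(penguins_clean)      # first six rows
tail(penguins_clean)      # last six rows
glimpse(penguins_clean)   # row and column counts, plus column types
colnames(penguins_clean)  # column names only
```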
## # A tibble: 6 × 8
## species island bill_length_mm bill_depth_mm flipper_length_mm body_mass_g
## <fct> <fct> <dbl> <dbl> <int> <int>
## 1 Adelie Torgersen 39.1 18.7 181 3750
## 2 Adelie Torgersen 39.5 17.4 186 3800
## 3 Adelie Torgersen 40.3 18 195 3250
## 4 Adelie Torgersen 36.7 19.3 193 3450
## 5 Adelie Torgersen 39.3 20.6 190 3650
## 6 Adelie Torgersen 38.9 17.8 181 3625
## # ℹ 2 more variables: sex <fct>, year <int>
## # A tibble: 6 × 8
## species island bill_length_mm bill_depth_mm flipper_length_mm body_mass_g
## <fct> <fct> <dbl> <dbl> <int> <int>
## 1 Chinstrap Dream 45.7 17 195 3650
## 2 Chinstrap Dream 55.8 19.8 207 4000
## 3 Chinstrap Dream 43.5 18.1 202 3400
## 4 Chinstrap Dream 49.6 18.2 193 3775
## 5 Chinstrap Dream 50.8 19 210 4100
## 6 Chinstrap Dream 50.2 18.7 198 3775
## # ℹ 2 more variables: sex <fct>, year <int>
## Rows: 333
## Columns: 8
## $ species <fct> Adelie, Adelie, Adelie, Adelie, Adelie, Adelie, Adel…
## $ island <fct> Torgersen, Torgersen, Torgersen, Torgersen, Torgerse…
## $ bill_length_mm <dbl> 39.1, 39.5, 40.3, 36.7, 39.3, 38.9, 39.2, 41.1, 38.6…
## $ bill_depth_mm <dbl> 18.7, 17.4, 18.0, 19.3, 20.6, 17.8, 19.6, 17.6, 21.2…
## $ flipper_length_mm <int> 181, 186, 195, 193, 190, 181, 195, 182, 191, 198, 18…
## $ body_mass_g <int> 3750, 3800, 3250, 3450, 3650, 3625, 4675, 3200, 3800…
## $ sex <fct> male, female, female, female, male, female, male, fe…
## $ year <int> 2007, 2007, 2007, 2007, 2007, 2007, 2007, 2007, 2007…
## [1] "species" "island" "bill_length_mm"
## [4] "bill_depth_mm" "flipper_length_mm" "body_mass_g"
## [7] "sex" "year"
Having confirmed that the data show no errors or complications that require cleaning, we can select the columns we will use in our analysis:
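One way to make this selection is sketched below. The object name `penguins_selected` is hypothetical, and the `drop_na()` step is an assumption based on the 333 rows shown in the output; the six column names are taken from that output.

```r
library(dplyr)
library(tidyr)
library(palmerpenguins)

penguins_selected <- penguins %>%
  drop_na() %>%                      # assumption: keep complete rows only
  select(species, bill_length_mm, bill_depth_mm,
         flipper_length_mm, body_mass_g, sex)

# Confirm that no missing values remain, then inspect the structure
anyNA(penguins_selected)
glimpse(penguins_selected)
```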
## Rows: 333
## Columns: 6
## $ species <fct> Adelie, Adelie, Adelie, Adelie, Adelie, Adelie, Adel…
## $ bill_length_mm <dbl> 39.1, 39.5, 40.3, 36.7, 39.3, 38.9, 39.2, 41.1, 38.6…
## $ bill_depth_mm <dbl> 18.7, 17.4, 18.0, 19.3, 20.6, 17.8, 19.6, 17.6, 21.2…
## $ flipper_length_mm <int> 181, 186, 195, 193, 190, 181, 195, 182, 191, 198, 18…
## $ body_mass_g <int> 3750, 3800, 3250, 3450, 3650, 3625, 4675, 3200, 3800…
## $ sex <fct> male, female, female, female, male, female, male, fe…
Next, we will analyze the data graphically, using ggplot2 to build the plots.
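The original figures are not reproduced here, so the following is only an illustrative sketch of the kind of plot that could support this step: flipper length against body mass, coloured by species and split by sex. The choice of a scatter plot with linear trend lines, the object names and the labels are all assumptions.

```r
library(ggplot2)
library(tidyr)
library(palmerpenguins)

# Assumption: work with complete rows only
penguins_complete <- drop_na(penguins)

flipper_plot <- ggplot(
  penguins_complete,
  aes(x = body_mass_g, y = flipper_length_mm, colour = species)
) +
  geom_point(alpha = 0.7) +
  # One straight-line fit per species within each panel
  geom_smooth(method = "lm", formula = y ~ x, se = FALSE) +
  # One panel per sex
  facet_wrap(~sex) +
  labs(
    x = "Body mass (g)",
    y = "Flipper length (mm)",
    colour = "Species",
    title = "Flipper length against body mass, by species and sex"
  )

flipper_plot
```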
From the data we can see a positive correlation between body mass and flipper length across species. Furthermore, males on average have longer flippers than females in every species. These observations are summarized in the tables below, first for males and then for females:

Males:

| Species | No. Individuals | Average Flipper Length (mm) |
|---|---|---|
| Adelie | 73 | 192.4110 |
| Chinstrap | 34 | 199.9118 |
| Gentoo | 61 | 221.5410 |

Females:

| Species | No. Individuals | Average Flipper Length (mm) |
|---|---|---|
| Adelie | 73 | 187.7945 |
| Chinstrap | 34 | 191.7353 |
| Gentoo | 58 | 212.7069 |
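The counts and mean flipper lengths in the two tables above can be reproduced by grouping on sex and species, as in this sketch. The object and column names are hypothetical, and restricting to complete rows is an assumption; the last line computes the overall correlation between body mass and flipper length mentioned in the text.

```r
library(dplyr)
library(tidyr)
library(palmerpenguins)

# Assumption: work with complete rows only
penguins_complete <- drop_na(penguins)

# One row per sex and species: group size and mean flipper length
flipper_summary <- penguins_complete %>%
  group_by(sex, species) %>%
  summarise(
    n_individuals = n(),
    mean_flipper_length_mm = mean(flipper_length_mm),
    .groups = "drop"
  )
flipper_summary

# Pearson correlation between body mass and flipper length, all species pooled
mass_flipper_cor <- cor(
  penguins_complete$body_mass_g,
  penguins_complete$flipper_length_mm
)
mass_flipper_cor
```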
Bill depth was most similar between Adelie and chinstrap penguins, while gentoo bills were shallower. In every species, males on average have both deeper and longer bills than females. These observations are summarized in the tables below, first for males and then for females:

Males:

| Species | No. Individuals | Average Bill Depth (mm) | Average Bill Length (mm) |
|---|---|---|---|
| Adelie | 73 | 19.07260 | 40.39041 |
| Chinstrap | 34 | 19.25294 | 51.09412 |
| Gentoo | 61 | 15.71803 | 49.47377 |

Females:

| Species | No. Individuals | Average Bill Depth (mm) | Average Bill Length (mm) |
|---|---|---|---|
| Adelie | 73 | 17.62192 | 37.25753 |
| Chinstrap | 34 | 17.58824 | 46.57353 |
| Gentoo | 58 | 14.23793 | 45.56379 |
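The bill measurements in the tables above can be summarized the same way. This is a sketch under the same assumptions as before (complete rows only, hypothetical object and column names), not necessarily the script used for the report.

```r
library(dplyr)
library(tidyr)
library(palmerpenguins)

# Assumption: work with complete rows only
penguins_complete <- drop_na(penguins)

# One row per sex and species: group size and mean bill dimensions
bill_summary <- penguins_complete %>%
  group_by(sex, species) %>%
  summarise(
    n_individuals = n(),
    mean_bill_depth_mm = mean(bill_depth_mm),
    mean_bill_length_mm = mean(bill_length_mm),
    .groups = "drop"
  )
bill_summary
```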
Step 5 | Share
This report has been made available on GitHub, where viewers can read it, comment on it and see the R scripts used to analyse the data. The data is also public, and I expect many similar reports exist on the internet. However, the Palmer penguins data is exceptionally fun to play with, and I have a special interest in Antarctica myself. There will be no final phase (step 6) to ‘act’ on the data.
citation(package = "palmerpenguins")
## To cite palmerpenguins in publications use:
##
## Horst AM, Hill AP, Gorman KB (2020). palmerpenguins: Palmer
## Archipelago (Antarctica) penguin data. R package version 0.1.0.
## https://allisonhorst.github.io/palmerpenguins/. doi:
## 10.5281/zenodo.3960218.
##
## A BibTeX entry for LaTeX users is
##
## @Manual{,
## title = {palmerpenguins: Palmer Archipelago (Antarctica) penguin data},
## author = {Allison Marie Horst and Alison Presmanes Hill and Kristen B Gorman},
## year = {2020},
## note = {R package version 0.1.0},
## doi = {10.5281/zenodo.3960218},
## url = {https://allisonhorst.github.io/palmerpenguins/},
## }
citation(package = 'bslib')
## To cite package 'bslib' in publications use:
##
## Sievert C, Cheng J, Aden-Buie G (2023). _bslib: Custom 'Bootstrap'
## 'Sass' Themes for 'shiny' and 'rmarkdown'_. R package version 0.5.0,
## <https://CRAN.R-project.org/package=bslib>.
##
## A BibTeX entry for LaTeX users is
##
## @Manual{,
## title = {bslib: Custom 'Bootstrap' 'Sass' Themes for 'shiny' and 'rmarkdown'},
## author = {Carson Sievert and Joe Cheng and Garrick Aden-Buie},
## year = {2023},
## note = {R package version 0.5.0},
## url = {https://CRAN.R-project.org/package=bslib},
## }
citation(package = 'tidyverse')
## To cite package 'tidyverse' in publications use:
##
## Wickham H, Averick M, Bryan J, Chang W, McGowan LD, François R,
## Grolemund G, Hayes A, Henry L, Hester J, Kuhn M, Pedersen TL, Miller
## E, Bache SM, Müller K, Ooms J, Robinson D, Seidel DP, Spinu V,
## Takahashi K, Vaughan D, Wilke C, Woo K, Yutani H (2019). "Welcome to
## the tidyverse." _Journal of Open Source Software_, *4*(43), 1686.
## doi:10.21105/joss.01686 <https://doi.org/10.21105/joss.01686>.
##
## A BibTeX entry for LaTeX users is
##
## @Article{,
## title = {Welcome to the {tidyverse}},
## author = {Hadley Wickham and Mara Averick and Jennifer Bryan and Winston Chang and Lucy D'Agostino McGowan and Romain François and Garrett Grolemund and Alex Hayes and Lionel Henry and Jim Hester and Max Kuhn and Thomas Lin Pedersen and Evan Miller and Stephan Milton Bache and Kirill Müller and Jeroen Ooms and David Robinson and Dana Paige Seidel and Vitalie Spinu and Kohske Takahashi and Davis Vaughan and Claus Wilke and Kara Woo and Hiroaki Yutani},
## year = {2019},
## journal = {Journal of Open Source Software},
## volume = {4},
## number = {43},
## pages = {1686},
## doi = {10.21105/joss.01686},
## }